{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!guardrails hub install hub://guardrails/toxic_language"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Check whether an LLM-generated response contains toxic language\n",
    "\n",
    "### Using the `ToxicLanguage` validator\n",
    "\n",
    "This is a simple walkthrough of the `ToxicLanguage` validator. This validator checks whether an LLM-generated response contains toxic language. It uses the pre-trained multi-label model from HuggingFace -`unitary/unbiased-toxic-roberta` to check whether the generated text is toxic. It supports both full-text-level and sentence-level validation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import the guardrails package\n",
    "# and the ToxicLanguage validator\n",
    "# from Guardrails Hub\n",
    "import guardrails as gd\n",
    "from guardrails.hub import ToxicLanguage\n",
    "from rich import print"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a Guard object with this validator\n",
    "# Here, we'll use the default validation method of \"sentence\"\n",
    "# and the default threshold of 0.25\n",
    "\n",
    "guard = gd.Guard.from_string(\n",
    "    validators=[ToxicLanguage(on_fail=\"fix\")],\n",
    "    description=\"testmeout\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "What a lovely day it has been to roam around the city.\n",
       "I love it here!\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n",
       "What a lovely day it has been to roam around the city.\n",
       "I love it here!\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Parse the raw response\n",
    "raw_response = \"\"\"\n",
    "What a lovely day it has been to roam around the city. I love it here!\n",
    "I hate how pathetic and needy you can be sometimes.\n",
    "\"\"\"\n",
    "raw_llm_output, validated_output, *rest = guard.parse(\n",
    "    llm_output=raw_response,\n",
    ")\n",
    "\n",
    "# Print the output\n",
    "print(validated_output)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see here, the sentence `I hate how pathetic and needy you can be sometimes.` has been detected as toxic, and hence removed from the validated output.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "I am a Generative AI model that is trained on a large corpus of text.\n",
       "This is a very powerful tool for generating new text, but it can also be used to generate text that is offensive or\n",
       "hateful.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n",
       "I am a Generative AI model that is trained on a large corpus of text.\n",
       "This is a very powerful tool for generating new text, but it can also be used to generate text that is offensive or\n",
       "hateful.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Parse the raw response\n",
    "raw_response = \"\"\"\n",
    "I am a Generative AI model that is trained on a large corpus of text.\n",
    "I am shocked by how disgusting and vile you are.\n",
    "This is a very powerful tool for generating new text, but it can also be used to generate text that is offensive or hateful.\n",
    "\"\"\"\n",
    "raw_llm_output, validated_output, *rest = guard.parse(\n",
    "    llm_output=raw_response,\n",
    ")\n",
    "\n",
    "# Print the output\n",
    "print(validated_output)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, here the sentence `I am shocked by how disgusting and vile you are.` has been detected as toxic, and hence removed from the validated output.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test with validation method 'full'\n",
    "full_guard = gd.Guard.from_string(\n",
    "    validators=[ToxicLanguage(validation_method=\"full\", on_fail=\"fix\")],\n",
    "    description=\"testmeout\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Parse the raw response\n",
    "raw_response = \"Stop being such a dumb piece of shit. Why can't you comprehend this?\"\n",
    "raw_llm_output, validated_output, *rest = full_guard.parse(\n",
    "    llm_output=raw_response,\n",
    ")\n",
    "\n",
    "# Print the output\n",
    "print(validated_output)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we're doing validation on the entire text, and toxic language was detected here - hence, the nothing is returned here.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "lang",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.17"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
